Chapter I
Introduction 


One of the most challenging problems in the medical field today is the early
detection of tumors. When a cancerous tumor is not detected in time, its treatment
becomes difficult if not impossible. In an attempt to assist in the solution of this problem,
a signal processing procedure referred to as the Automatic Statistical Characterization
And Partitioning of Environment (A'SCAPE) [1] was utilized. A'SCAPE has been
successfully used to characterize infrared (IR) and radar land scenes as a target
pre-detection stage (DoD application) [1] and, in a law enforcement application, to
detect weapons concealed underneath clothing in scenes collected by different types of
sensors, including IR and millimeter-wave sensors [1].

In this project, the prime emphasis was placed on structuring A'SCAPE to (1)
detect tumors in lung tissue, and (2) classify a particular tumor as being either benign or
malignant using computed tomography (CT) data. It is shown that A'SCAPE (1) can
successfully highlight suspected tumor tissue within the CT lung image, and (2) has the
potential to detect and classify tumors. The latter point, however, can be proven only if
more data on both benign and malignant tumors are made available. During this effort, a
library of only 5 usable data sets was available. Two of these sets are used to develop
rules and the remaining three are used to test the rules. Even so, the results demonstrate
promise for future application of this approach.

The  A'SCAPE  procedure  is  presented  in  Chapter  II.  In  Chapter  III,  A'SCAPE  is 
applied  to  5  different  cases  of  cancerous  lungs.  Processing  and  results  are  provided  in  the 
same  chapter.  A  conclusion  and  a  set  of  recommendations  for  future  work  are  given  in 
Chapter IV.
Chapter II
A'SCAPE Procedure


2.1  -  Introduction 

The  goal  of  this  work  is  to  enhance  and  detect  tumors  in  the  data  collected  by  CT 
scans.  Enhancement  of  the  tumor  area  constitutes  a  de-cluttering  problem  where  the 
objects  other  than  the  tumors  are  considered  as  clutter.  In  this  work,  two  methods  are 
used  for  this  purpose.  First,  based  on  the  difference  between  the  average  power  levels  of 
the  different  regions  in  a  scene,  a  mapping  procedure  [2]  is  used  to  isolate  the 
homogeneous  regions.  This  stage  is  needed  to  clean  the  image  from  unneeded  isolated 
cells  and  to  determine  the  regions  that  can  be  suspected  of  containing  tumors.  The 
procedure  has  the  potential  to  separate  between  regions  of  different  average  power  levels 
even when the power levels are very close. On the other hand, for those regions that are
contiguous and non-homogeneous but with similar average power levels, a statistical
procedure [3] is used to separate them. The procedure is a state-of-the-art method that
groups data with similar statistical distributions into subsets called regions. By doing so,
different regions are obtained, each with a corresponding probability density function
(PDF) that best approximates the statistical distribution of the data in that region.
Each group of data defines a homogeneous region. Both the Mapping and Statistical
procedures are part of the Automated Statistical Characterization and Partitioning of the
Environment (A'SCAPE) [1] shown in Figure 2.1.

Both procedures are presented next.


2.2  -  Mapping  Procedure 

Assuming that a scene consists of a set of patches, let LP refer to the patch with
the lowest average magnitude and let RPs refer to the remaining patches. Using the
fact that the RPs, on average, have stronger magnitudes, the mapping procedure
begins by setting a threshold that results in a specified fraction of LP pixels. Image
processing is then used to establish the LP and RPs patches. If the final image contains a
significantly different fraction of LP pixels than originally established by the initial
threshold, the process is repeated with a new threshold. The mapping procedure iterates
until the final scene is consistent with the previously specified threshold. Finally,
edges of all patches are detected using an image processing technique referred to as
"unsharp masking".

As shown in Figure 2.2, the mapping procedure therefore consists of two
stages. In the first stage, the patch with the lowest average power, LP, is identified among
all remaining patches, RPs. In the second stage, edges of the LP are enhanced
and detected.

These  two  stages  are  repeated  to  identify  the  next  LP  and  so  on.  The  mapping 
procedure  is  repeated  continuously  until  it  is  not  possible  to  further  separate  between 
patches,  and  all  patches  are  declared  to  be  homogeneous.  Once  all  patches  have  been 
found,  every  patch  is  processed  by  the  mapping  procedure,  as  discussed  above,  for 
detection  of  subpatches. 

The  Mapping  procedure  as  described  in  [2]  is  modified  to  simultaneously 
compute  all  thresholds  that  would  separate  between  the  identified  regions  in  a  scene.  As 
shown  in  Figure  2.3,  this  step  is  performed  by  the  block  referred  to  as  the  Automatic 
Thresholds  Computation  (ATC).  Then,  for  each  threshold  the  Mapping  procedure:  (1) 
quantizes  the  scene  at  that  threshold,  (2)  extracts  the  different  regions  through  a  low  pass 
filter,  and  (3)  extracts  the  edges  corresponding  to  the  different  regions  using  a  high  pass 
filter. The resulting scene is referred to as the component image. These steps are repeated
for each threshold. At the end, the component images are combined to form a composite
image that is equivalent to the original image but has the different regions
enhanced and represented visually by different colors. One or more of these regions
would represent tumor(s). Note that if the data in the original scene consist of 8 bits, the
scene contains 256 color levels. The composite image, on the other hand, consists of only N
color levels, where N is the number of thresholds found by the ATC stage.
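
The per-threshold steps (quantize, low pass, high pass, combine) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of [2]: the 3x3 mean filter standing in for the low pass filter, the "region minus its low-passed version" test standing in for the high pass filter, and the function names box_filter3, component_image and composite_image are all hypothetical. The thresholds are taken as given rather than computed by the ATC block.

```python
import numpy as np

def box_filter3(a):
    """3x3 mean filter with edge replication (a simple low pass filter)."""
    p = np.pad(np.asarray(a, dtype=float), 1, mode="edge")
    h, w = a.shape
    return sum(p[r:r + h, c:c + w] for r in range(3) for c in range(3)) / 9.0

def component_image(scene, threshold):
    """Return (region mask, edge mask) for one threshold."""
    mask = scene >= threshold                 # (1) quantize at the threshold
    region = box_filter3(mask) > 0.5          # (2) low pass: drops isolated cells
    # (3) high pass: region pixels whose 3x3 neighborhood is not entirely region
    edges = region & (box_filter3(region) < 1.0 - 1e-9)
    return region, edges

def composite_image(scene, thresholds):
    """Count, per pixel, how many component regions contain it."""
    composite = np.zeros(scene.shape, dtype=int)
    for t in sorted(thresholds):
        region, _ = component_image(scene, t)
        composite += region
    return composite
```

In this sketch each pixel of the composite holds the number of component regions that contain it, so pixels with the same count can be drawn in the same color.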
Figure 2.1 - Block Diagram of A'SCAPE

Figure 2.3 - Block Diagram of the Modified Mapping Procedure

2.3 - Statistical Procedure


2.3.1  -  Introduction 

When it is not possible to separate contiguous non-homogeneous regions
based on power levels, the regions can be separated using the data
distribution in each region. Specifically, using the Statistical procedure of A'SCAPE [3],
one can investigate the PDF(s) that approximate the data in the scene and, based on
the outcome, decide whether the scene is homogeneous. If it is not, the procedure
determines the different homogeneous regions and their boundaries. The Statistical
procedure of A'SCAPE is based on the Ozturk Algorithm, which is presented next,
followed by the details of the Statistical Procedure.

2.3.2  -  Ozturk  Algorithm  [4] 

In  signal  processing  applications  it  is  common  to  assume  a  Gaussian  problem  in 
the  design  of  optimal  signal  processors.  However,  non-Gaussian  processes  do  arise  in 
many  situations.  When  the  possibility  of  a  non-Gaussian  problem  is  encountered,  the 
question  as  to  which  probability  distributions  should  be  utilized  in  a  specific  situation  for 
modeling  the  data  needs  to  be  answered.  In  practice,  the  underlying  probability 
distributions  are  not  known  a  priori  and,  therefore,  have  to  be  determined  experimentally 
by  analyzing  the  random  data. 

Classical techniques do not approximate the underlying probability distribution
of a given set of data. They only perform goodness-of-fit tests by providing an answer to
the question “Is the set of random data statistically consistent with a specified distribution
to within a desired confidence level?”. Furthermore, these techniques require a large
number of samples (typically several thousand) to perform the test.

On the other hand, the Ozturk algorithm is a new statistical algorithm capable of
approximating the probability density function (PDF) of a set of random data using
only 100 statistically independent sample points. It consists of two modes: the
goodness-of-fit test mode and the PDF approximation mode. The first mode determines
whether a sample data set is statistically consistent with a pre-specified PDF. The second
mode selects the ‘best’ approximate PDF from a variety of PDFs and is simply an
extension of the goodness-of-fit test.

The  two  modes  are  described  in  detail  next. 

2.3.2.a - Goodness-of-fit Test Mode

The  goodness-of-fit  test  is  an  empirical  algorithm  which  determines  if  the  sample 
data  is  statistically  consistent  with  a  given  distribution,  called  the  null  hypothesis,  to 
within  a  desired  confidence  level.  In  the  Ozturk  algorithm,  linked  vectors  are  constructed 
for  both  the  null  hypothesis  and  the  sample  data  set.  The  confidence  contours  are 
constructed  around  the  terminal  point  of  the  null  hypothesis  linked  vector. 

2.3.2.b - The Linked Vector

To obtain the linked vectors, consider:

1 - the sample data set: x_1, x_2, x_3, ..., x_N with sample mean \mu_x, sample standard
deviation \sigma_x, and length N,

2 - a null hypothesis data set generated from any available distribution against which the
sample set will be tested: z_1, z_2, z_3, ..., z_N with zero mean, unit variance and length N, and

3 - an auxiliary data set generated from the standard Gaussian: w_1, w_2, w_3, ..., w_N with
zero mean, unit variance and length N.

Next, reorder all data sets (ordered statistics) with the smallest value first:

x_{1:N}, x_{2:N}, x_{3:N}, ..., x_{N:N}

z_{1:N}, z_{2:N}, z_{3:N}, ..., z_{N:N}

w_{1:N}, w_{2:N}, w_{3:N}, ..., w_{N:N}

Let y_{i:N}, for the sample linked vector, be defined as:

y_{i:N} = (x_{i:N} - \mu_x) / \sigma_x    (1)

The magnitude of the sample linked vector is the absolute value of y_{i:N}. Also, let
t_{i:N}, for the null hypothesis, be defined as the expected value of the i-th ordered
statistic of the null hypothesis distribution:

t_{i:N} = E[z_{i:N}].    (2)

The magnitude of the null hypothesis linked vector is the absolute value of t_{i:N}. Finally,
let m_{i:N} be the expected value of the i-th ordered statistic of the auxiliary distribution
(the Gaussian distribution in this case):

m_{i:N} = E[w_{i:N}].    (3)

The  expected  values  are  obtained  through  a  Monte-Carlo  simulation  consisting  of  2000 
generated  data  sets  for  both  the  null  hypothesis  and  auxiliary  data  sets. 

The set of angles associated with each linked vector is defined as:

\theta_i = \pi \Phi(m_{i:N})    (4)

where:

\Phi(a) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{a} \exp(-t^2/2) \, dt.    (5)

Thus, the angle depends on the value of the Gaussian distribution function, evaluated at
each expected value of the reference distribution ordered statistic.

Next, set up the coordinate system Q_k = [u_k, v_k], where:

u_k = \frac{1}{N} \sum_{i=1}^{k} |y_{i:N}| \cos\theta_i ;  k = 1, 2, 3, ..., N    (6)

v_k = \frac{1}{N} \sum_{i=1}^{k} |y_{i:N}| \sin\theta_i ;  k = 1, 2, 3, ..., N    (7)

and

Q_0 = [u_0, v_0] = (0, 0)    (8)

for the sample linked vector. The null hypothesis linked vector is obtained by replacing
y_{i:N} with t_{i:N} in the expressions for u_k and v_k above.

Note that the angle \theta_i is solely dependent on the auxiliary distribution for all
linked vectors, while the magnitude is solely dependent on the data chosen for the linked
vector for the sample data and null hypothesis.

Furthermore, note that y_{i:N} and t_{i:N} are ordered statistics from smallest to largest,
while the magnitudes of y_{i:N} and t_{i:N} are no longer true ordered statistics, due to
standardization. Since y_{i:N} and t_{i:N} contain negative values due to standardization,
their magnitudes begin large, decrease to approximately zero, and then increase
again.

Also,  when  the  length  N  of  the  sample  data  set  is  large  (on  the  order  of  50  points), 
then  the  linked  vector  is  a  smooth  arc. 
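
The construction of the linked vectors can be sketched in Python as follows. This is a minimal illustration rather than the authors' code, and it makes several assumptions: the partial sums are scaled by 1/N, the sample standard deviation uses the N-1 denominator, the Gaussian is used as both null hypothesis and auxiliary distribution, and the helper names (expected_order_stats, linked_vector, and so on) are hypothetical.

```python
import math
import numpy as np

def expected_order_stats(draw, n, trials=2000, seed=0):
    """Monte-Carlo estimate of the expected ordered statistics, i = 1..n."""
    rng = np.random.default_rng(seed)
    samples = np.sort(draw(rng, (trials, n)), axis=1)
    return samples.mean(axis=0)

def gaussian_cdf(a):
    """Standard Gaussian distribution function Phi(a)."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def standardized_order_stats(x):
    """Sort the sample, then subtract its mean and divide by its std."""
    x = np.sort(np.asarray(x, dtype=float))
    return (x - x.mean()) / x.std(ddof=1)

def linked_vector(values, m):
    """Points Q_0..Q_N: link i has magnitude |values[i]| / N and angle theta_i."""
    n = len(m)
    theta = np.pi * np.array([gaussian_cdf(a) for a in m])
    mag = np.abs(np.asarray(values, dtype=float))
    q = np.zeros((n + 1, 2))                      # row 0 is Q_0 = (0, 0)
    q[1:, 0] = np.cumsum(mag * np.cos(theta)) / n
    q[1:, 1] = np.cumsum(mag * np.sin(theta)) / n
    return q

def draw_gaussian(rng, size):
    return rng.standard_normal(size)

n = 100
m = expected_order_stats(draw_gaussian, n)        # auxiliary distribution
t = m                                             # Gaussian null hypothesis
null_vector = linked_vector(t, m)
sample = np.random.default_rng(1).standard_normal(n)
sample_vector = linked_vector(standardized_order_stats(sample), m)
print("null end point:", null_vector[-1])
print("sample end point:", sample_vector[-1])
```

Because every angle lies between 0 and pi, the v coordinate never decreases along the vector, and for a symmetric null hypothesis the u coordinate of the end point is close to zero.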


2.3.2.c - The Confidence Contours

The linked vector for the null hypothesis is based on the expected values of the
ordered statistics z_{i:N} over the 2000 Monte-Carlo simulations. Thus, if one considers
just one point along the linked vector, in particular the end point, the Monte-Carlo
simulation provides 2000 points of which only the expected value is plotted. However,
these 2000 points can also be analyzed for their distribution.

The confidence contours associated with the end point of the null hypothesis are
determined by fitting a three-dimensional bell-shaped (bivariate Gaussian) surface to the
2000 points arising from the distribution of the Monte-Carlo end points for the null
hypothesis linked vector. The contours of constant density of this distribution are then
plotted for various values of the parameter alpha (e.g., 0.01, 0.05, and 0.10), where alpha
is the probability that the end point falls outside the specified contour given that the data
is from the null hypothesis distribution. Alpha is known as the significance level of the
test, and 1 - alpha is referred to as the confidence level of the test; each contour is
identified by its confidence level.

This may be repeated for any of the N points of the ordered statistic, z_{i:N}, along the
null hypothesis linked vector. If the sample data is truly consistent with the null hypothesis,
then the sample data’s linked vector trajectory will pass through a series of hoops defined
by the confidence contours. Because the human eye can readily detect whether or not the
linked vectors are closely following the same trajectory, only the last set of confidence
contours is typically used.
As the significance level of the test increases, the corresponding confidence level
decreases and the confidence contours decrease in size. The closer the end point of the
linked vector for the sample data falls to the center of the confidence contours, the more
likely it is that the sample is from the null hypothesis.

Also, for a given sample size N, note that the i-th angle, which is dependent solely
on the auxiliary distribution, remains unchanged and is used for the i-th link of every
linked vector. Also, the magnitude of the sample data linked vector is solely dependent on
the sample data set.

An example of the goodness-of-fit test is shown in Figure 2.4. The confidence
contours are plotted for confidence levels of 0.90, 0.95, and 0.99. As mentioned above,
if the end point of the sample data linked vector locus falls within a contour, then the
sample data set is said to be statistically consistent with the null hypothesis at a
confidence level based on the probability specified for that contour. If the sample data set
is truly consistent with the null hypothesis, the system of sample linked vectors is likely
to closely follow the system of null linked vectors.
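
The contour test at the end point can be sketched in Python as follows. This is a sketch under assumptions, not the authors' procedure: it fits the bivariate Gaussian by the sample mean and covariance of the Monte-Carlo end points, and it uses the fact that, for a bivariate Gaussian, the squared Mahalanobis distance exceeds -2 ln(alpha) with probability alpha. The function names and the synthetic cloud of end points are hypothetical stand-ins.

```python
import numpy as np

def fit_end_point_cloud(end_points):
    """Mean and covariance of the Monte-Carlo end points."""
    pts = np.asarray(end_points, dtype=float)
    return pts.mean(axis=0), np.cov(pts, rowvar=False)

def contour_level(alpha):
    """Squared Mahalanobis radius of the (1 - alpha) confidence contour."""
    return -2.0 * np.log(alpha)

def is_consistent(q, center, cov, alpha):
    """True if end point q falls inside the (1 - alpha) contour."""
    d = np.asarray(q, dtype=float) - center
    return float(d @ np.linalg.solve(cov, d)) <= contour_level(alpha)

# Synthetic stand-in for the 2000 null hypothesis end points (made-up values).
rng = np.random.default_rng(0)
cloud = rng.multivariate_normal([0.0, 0.33],
                                [[4e-4, 1e-4], [1e-4, 9e-4]], size=2000)
center, cov = fit_end_point_cloud(cloud)
for alpha in (0.10, 0.05, 0.01):
    print(1 - alpha, is_consistent([0.03, 0.36], center, cov, alpha))
```

A larger alpha gives a smaller radius, which matches the statement that the contours shrink as the significance level increases.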


Figure  2.4  -  Example  of  Goodness-of-fit  Test 
2.3.2.d - PDF Approximation Mode

The PDF approximation chart takes this a step further by providing other
distributions. These distributions are computed in the same manner as described
previously for the null hypothesis, in that the magnitude of the linked vector is computed
from the expected value of the ordered statistic of 2000 Monte-Carlo simulations.
However, the angles \theta_i are still computed from the auxiliary distribution, and the
confidence contours are computed only for the null hypothesis.

In  order  not  to  clutter  the  approximation  chart,  only  the  end  points  of  all  linked 
vectors  are  provided  in  the  approximation  chart,  along  with  the  confidence  contours  for 
the  selected  distribution  (null  hypothesis). 

For distributions dependent only on location and scale parameters (no shape parameters),
such as Gaussian, Uniform, and Cauchy, there exists only one unique linked vector (for a
given value of N) since the data are normalized to have zero mean and unit variance.
Thus, only one point on the approximation chart is plotted. For distributions dependent
on a single shape parameter, such as Weibull, Lognormal, and K-distributed, different
values of the shape parameter result in different linked vectors. Consequently, the end
point of the linked vectors is also dependent on the shape parameter. The end points
corresponding to different shape parameter values are joined to obtain a single curve on
the approximation chart. This curve provides a unique representation for the PDF
dependent on a single shape parameter for a given value of N. Similarly, for a distribution
dependent on two shape parameters, such as the Beta distribution and SU-Johnson, a
series of linked vectors must be computed in order to plot the surface on which the end
point moves for varying shape parameters. This is formed by holding the first shape
parameter constant and varying the second shape parameter to generate a curve, then
changing the first shape parameter and again holding it constant while varying the second
shape parameter, etc., until a family of curves is produced over the surface that the
distribution occupies (for a given value of N).

The approximation chart is used by plotting on the chart the end point of the
sample linked vector built from y_{i:N}. To select the ‘best’ approximate PDF, the algorithm
chooses the closest distribution to the end point and estimates the shape parameters of this
distribution if required. The algorithm can also provide a rank order of selection for PDF
approximation of all available distributions based on their respective distances from the
sample data.

Figure  2.5  shows  an  example  of  the  approximation  chart.  Note  that  every  point  in 
the  approximation  chart  corresponds  to  a  specific  distribution.  That  point  closest  to  the 
sample  data  locus  end  point  is  chosen  as  the  best  approximation  to  the  PDF  underlying 
the  random  data.  This  closest  point  is  determined  by  projecting  the  sample  locus  end 
point  to  all  points  on  the  approximation  chart  and  selecting  that  point  whose 
perpendicular  distance  from  the  sample  point  is  the  smallest.  Once  the  PDF  underlying 
the  sample  data  is  selected,  the  shape,  location  and  scale  parameters  are  then 
approximated. 
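
The selection and ranking step can be sketched in Python as follows. This is a hedged illustration: the chart is represented as a dictionary of sampled end points per distribution, the nearest sampled point stands in for the perpendicular projection onto each curve, and both the function name rank_distributions and the chart coordinates are made up for the example (they are not read from Figure 2.5).

```python
import numpy as np

def rank_distributions(sample_end_point, chart):
    """Rank distributions by distance from the sample end point.

    chart maps a distribution name to a list of (shape, u, v) entries,
    one per tabulated shape parameter value (shape is None if there is none).
    Returns a list of (distance, name, closest shape), nearest first.
    """
    q = np.asarray(sample_end_point, dtype=float)
    ranking = []
    for name, entries in chart.items():
        dists = [float(np.hypot(u - q[0], v - q[1])) for _, u, v in entries]
        best = int(np.argmin(dists))
        ranking.append((dists[best], name, entries[best][0]))
    return sorted(ranking, key=lambda item: item[0])

# Made-up chart coordinates, for illustration only.
chart = {
    "Gaussian": [(None, 0.0, 0.33)],
    "Weibull": [(1.0, -0.15, 0.30), (2.0, -0.05, 0.335), (3.0, 0.0, 0.34)],
}
for distance, name, shape in rank_distributions((-0.14, 0.30), chart):
    print(f"{name:10s} shape={shape} distance={distance:.3f}")
```

A finer sampling of each curve brings the nearest sampled point closer to the true perpendicular projection described above.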


2.3.3  -  Details  of  the  statistical  procedure 

Recall that the Ozturk algorithm not only approximates the body of the PDF for a
given set of data but also gives confidence ellipses to determine the quality of the
approximation. One can choose different PDFs such that their end points, defined by
[u_{100}, v_{100}], and their corresponding 0.99 ellipses are spread throughout the
approximation chart. By doing so, one can define regions in the approximation chart
bounded by the ellipses, where each region hosts the end points of the data that would be
generated from the corresponding PDF. In this case, whenever the end point of a given set
of sample data falls within a given ellipse, it is assigned the PDF corresponding to that
ellipse as the approximating PDF. This technique is detailed next.

Set-up  of  the  Approximation  Chart 

Consider a set of 15 PDFs, as tabulated and numbered in Table 2.1 along with
their shape parameters. The end points and the 99% confidence ellipses corresponding to
the PDFs are displayed in the UV plane of the approximation chart, as shown in Figures
2.6 and 2.7, respectively. Figure 2.8 shows the colors assigned to the ellipses of the
different PDFs as listed in Table 2.1.
Figure 2.5 - Example of the Approximation Chart (U: Uniform, C: Cauchy, N: Normal,
K: K-distributed, L: Lognormal, W: Weibull)

Table 2.1 - List of PDFs whose ellipses are used in the Approximation
Chart and their corresponding colors.

N: Normal, W: Weibull, B: Beta, SU: SU-Johnson

Figure 2.6 - End-Points location of the set of 15 PDFs

Figure 2.7 - 99% Ellipses of the set of 15 PDFs

Figure 2.8 - Chart for the ellipses of the different PDFs


Set-up of the scene

✓ Consider now a scene consisting of MxN pixels. Let the pair (i,j) designate the
location of a given pixel in the scene referred to as the test pixel. Each pixel (i,j) in the
scene has a data value of v(i,j).

✓ For each test pixel select a set of 99 reference pixels which are the closest in
location to the test pixel. Let the set of 100 data from the reference pixels and the test
pixel be referred to as the sample data set for the test pixel. Note that in this work, 100
data samples are used each time to determine the best approximating PDF of the samples
using the Ozturk algorithm.


Methodology 

✓ For a given test pixel, obtain its corresponding sample data set.

✓ Compute the coordinate Q_{100} = [u_{100}, v_{100}], as defined by Eqs. 6-8, corresponding
to the data set.

✓ Check in which ellipse Q_{100} is located.

✓ Record the number from Table 2.1 assigned to the PDF corresponding to that
ellipse.

✓ Assign that number to the test pixel.

✓ Repeat the above steps for each pixel in the scene.

At  the  end,  a  mapped  scene  is  obtained  where  each  pixel  would  have  a  value 
between  1  and  15  referring  to  the  PDF  which  would  best  approximate  the  data  in  the  test 
pixel  when  it  is  associated  with  its  closest  99  reference  pixels.  The  mapped  scene  will 
show  “regions”  with  different  values.  These  regions  will  define  the  homogeneous  patches 
in  the  scene. 
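
The methodology above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: ties among equally distant reference pixels are broken in raster order, each ellipse is described by a hypothetical (center, covariance, squared radius) triple, a pixel whose end point lies in no ellipse (or whose sample is constant) is given the label 0, and the first matching ellipse wins if several overlap. All function names are illustrative.

```python
import math
import numpy as np

def gaussian_angles(n, trials=2000, seed=0):
    """Angles theta_i = pi * Phi(m_i), with m_i estimated by Monte Carlo."""
    rng = np.random.default_rng(seed)
    m = np.sort(rng.standard_normal((trials, n)), axis=1).mean(axis=0)
    return np.pi * np.array([0.5 * (1 + math.erf(a / math.sqrt(2))) for a in m])

def closest_sample(image, i, j, size=100):
    """Values of the `size` pixels nearest to (i, j), test pixel included."""
    rows, cols = np.indices(image.shape)
    dist2 = (rows - i) ** 2 + (cols - j) ** 2
    nearest = np.argsort(dist2, axis=None, kind="stable")[:size]
    return image.ravel()[nearest]

def end_point(sample, theta):
    """End point of the sample linked vector, or None for constant data."""
    x = np.sort(np.asarray(sample, dtype=float))
    spread = x.std(ddof=1)
    if spread == 0:
        return None
    y = np.abs((x - x.mean()) / spread)
    return np.array([np.mean(y * np.cos(theta)), np.mean(y * np.sin(theta))])

def map_scene(image, ellipses, theta):
    """Label each pixel with the number of the first ellipse holding its end point."""
    labels = np.zeros(image.shape, dtype=int)     # 0: no ellipse (assumption)
    for i, j in np.ndindex(*image.shape):
        q = end_point(closest_sample(image, i, j, len(theta)), theta)
        if q is None:
            continue
        for number, (center, cov, radius2) in ellipses.items():
            d = q - center
            if d @ np.linalg.solve(cov, d) <= radius2:
                labels[i, j] = number
                break
    return labels
```

A call such as map_scene(image, ellipses, gaussian_angles(100)) returns the mapped scene. The per-pixel sort keeps the sketch short but is slow on large scenes, where a sliding window would be the more practical choice.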
Chapter III

Application  of  A'SCAPE  to  Medical  Imaging 


3.1  -  Introduction 

The  A'SCAPE  procedure  is  applied  next  to  real  data  of  CT  scans.  First,  one 
example  of  a  known  benign  case  and  a  second  example  of  a  known  malignant  case  are 
used  to  build  rules  to  identify  the  tumors  and  classify  them.  Then,  the  rules  are  tested  by 
applying them to three different cases of unknown types of tumor. Note that, because
only one sample from each case (malignant and benign) is used to build the rules, these
are not necessarily well justified and may be erroneous. Nevertheless, these rules
serve the objective of showing how to approach the problem of detecting and classifying
tumors using the A'SCAPE procedure.

The Mapping procedure is used to isolate all regions that might be identified as
tumors. This is done by clearing all non-desirable patches such as branches and vessels
from the scene. Among the patches left in the scene, those with more than 100 pixels are
then identified to be processed later by the Statistical procedure. The latter procedure is
used to both identify and classify tumors.

In  the  following  sections,  examples  of  the  application  of  A'SCAPE  are  shown. 
First  in  Sections  3.2  and  3.3,  A'SCAPE  is  applied  to  known  cases  of  benign  and 
malignant  tumors.  Then,  in  Section  3.4,  rules  are  built  to  help  identify  and  classify 
tumors. Finally, in Sections 3.5-3.7, the rules are tested on examples of cases of lungs
with unknown types of tumor.

3.2  -  Example  1:  Benign  Case  Number  E20799S2I1 

Consider the case of a lung with a benign tumor shown in the lower lung of
Figure 3.1. Note that the original scene is a 512 by 512 image. First, the lung with the
tumor is selected as shown in Figure 3.2. Note that this scene has 130 by 149 pixels. The
tumor is shown by the patch inside the lung located in the bottom left of the image.
Application of the Mapping procedure results in 5 thresholds, thus generating 5
component images whose composite image is shown in Figure 3.3 with 5 different gray
levels. The component image that contains the tumor is selected as shown in Figure 3.4.
The component image is then labeled to result in the labeled scene of Figure 3.5. Note
that, as shown in Table 3.1, only patch number 3 meets the condition that it is inside the
lung and contains more than 100 pixels. This patch is in fact the tumor to be investigated.
Table 3.2 summarizes the type of approximating PDFs for each pixel in patch 3 as well as
the number of pixels with the same approximating PDF.

Processing  patch  number  3  through  the  Statistical  Procedure  results  in  the  colored 
scene  of  Figure  3.6.  Note  that  the  different  colors  in  the  scene  refer  to  the  PDFs  of  the 
corresponding  pixels  as  listed  in  Table  2.1.  Also,  Figure  3.7  shows  in  black  the  spread  of 
the  UV  pairs  corresponding  to  the  pixels  of  patch  number  3.  The  following  are  to  be 
noted: 

□ In Figure 3.6,

✓ two distinct contiguous regions can be seen: Region 1, formed by pixels
with UV pairs outside the ellipses, and Region 2, formed by pixels from the Beta PDF
surrounding other pixels from the SU-Johnson PDF

✓ concentric behavior is displayed by the pixels from the Beta PDF
surrounding other pixels from the SU-Johnson PDF

□ In Figure 3.7,

✓ UV pairs are located on the right part of the chart

✓ UV pairs are widely spread

✓ UV pairs are spread vertically
Figure 3.1 - E20799S2I1 - Original Scene

Figure 3.2 - E20799S2I1 - Lung with Tumor

Figure 3.4 - E20799S2I1 - Component Scene Containing Tumor

Figure 3.6 - E20799S2I1 - Patch No. 3 with Approximated PDFs

Table 3.1 - E20799S2I1 - Results of labeling

Table 3.2 - E20799S2I1 - PDFs of the pixels in patch 3

3.3 - Example 2: Malignant Case Number E14649S2I14

Consider now the case of a lung with a malignant tumor shown in the upper lung
of Figure 3.8. Note that the original scene is a 512 by 512 image. As was done in
Example 1, the lung with the tumor is selected as shown in Figure 3.9. Note that this
scene has 278 by 191 pixels. The tumor is shown by the patch inside the lung located in
the upper left of the image. Application of the Mapping procedure results in 6 thresholds,
thus generating 6 component images whose composite image is shown in Figure 3.10
with 6 different gray levels. The component image that contains the tumor is selected as
shown in Figure 3.11. The component image is then labeled to result in the labeled scene
of Figure 3.12. Note that, as shown in Table 3.3, only patch number 2 is inside the lung
and has more than 100 pixels. This patch is in fact the tumor to be investigated. Table 3.4
summarizes the type of approximating PDFs for each pixel in patch 2 as well as the
number of pixels with the same approximating PDF.

Processing  patch  numbered  2  through  the  Statistical  Procedure  results  in  the 
colored  scene  of  Figure  3.13.  Note  again  that  the  different  colors  in  the  scene  refer  to  the 
PDFs  of  the  corresponding  pixels  as  listed  in  Table  2.1.  Also,  Figure  3.14  shows  in  black 
the  spread  of  the  UV  pairs  corresponding  to  the  pixels  of  patch  numbered  2.  The 
following  are  to  be  noted: 

□ In Figure 3.13,

✓ two distinct contiguous regions can be seen: Region 1, formed by pixels
with UV pairs in a particular Beta ellipse, and Region 2, formed by pixels from another
Beta PDF

□ In Figure 3.14,

✓ UV pairs are located on the right part of the chart

✓ UV pairs are widely spread

✓ UV pairs are spread horizontally
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Figure  3.8  -  E14649S2I14  -  Original  Scene 
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Figure  3.9  -  E14649S2I14  -  Lung  with  Tumor 
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Figure  3.10  -  E14649S2I14  -  Composite  Scene 
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Figure  3.11  -  E14649S2I14  -  Component  Scene  Containing  Tumor 
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Figure  3.12  -  E14649S2114  -  Labeled  Scene 


Figure  3.13  -  E14649S2I14  -  Patch  No.  2  with  Approximated  PDFs 
Table 3.3 - E14649S2I14 - Results of labeling
Table 3.4 - E14649S2I14 - PDFs of the pixels in patch 2

3.4 - Rules deduced from examples 1 and 2

Using the notes on Figures 3.6, 3.7, 3.13, and 3.14 for the Benign and Malignant
tumors, the following rules are deduced:

Rule number 1: A patch is declared to be a Tumor when
    - UV pairs are located on the right part of the chart
    - UV pairs are spread

Rule number 2: A patch is declared to be a Benign Tumor when
    - UV pairs are spread vertically
    - Concentric regions are displayed in the colored patch

Rule number 3: A patch is declared to be a Malignant Tumor when
    - UV pairs are spread horizontally
    - Contiguous regions are displayed in the colored patch
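
The three rules can be written as a small decision function. The sketch below is one interpretation, not the procedure used in this study: the function name and argument encoding are hypothetical. Two choices are assumptions drawn from how the rules are applied in the examples that follow: a patch meeting only one condition of Rule 1 is reported as inconclusive, and a spread direction that was not recorded does not block a Benign or Malignant decision.

```python
def apply_rules(on_right, wide_spread, pattern=None, direction=None):
    """Apply Rules 1 to 3 to the notes recorded for one patch.

    on_right, wide_spread: booleans read off the UV chart.
    pattern:   "concentric", "contiguous", or None if neither is observed.
    direction: "vertical", "horizontal", or None if not recorded.
    Returns a (detection, classification) pair.
    """
    # Rule 1: both conditions are required to declare a tumor.
    if on_right and wide_spread:
        detection = "tumor"
    elif on_right or wide_spread:
        # Assumption: a partial match leads to no conclusion.
        return ("inconclusive", None)
    else:
        return ("not a tumor", None)

    # Rules 2 and 3: the color pattern selects the candidate type,
    # and the spread direction (when recorded) must agree with it.
    expected = {
        "concentric": ("benign", "vertical"),
        "contiguous": ("malignant", "horizontal"),
    }
    if pattern not in expected:
        return (detection, "undetermined")
    label, expected_direction = expected[pattern]
    if direction is not None and direction != expected_direction:
        return (detection, "undetermined")
    return (detection, label)


# Notes matching the Malignant training example (Figures 3.13 and 3.14).
print(apply_rules(True, True, "contiguous", "horizontal"))
```

A patch showing both concentric and contiguous regions does not fit this single-pattern encoding and would have to be handled region by region.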


It is important to mention again that the number of samples upon which the rules
are based is very limited and, thus, the rules might not be valid. Using the data at hand
and the above rules, three cases of unknown types of tumor are investigated next. In each
case, even though the tumor location is known a priori, its type (Malignant or Benign) is
unknown. Note that for each case (1) the Mapping procedure segments the scene, cleans
out branches, and detects patches that might be tumors. Then, (2) for each patch with at
least 100 pixels, the Statistical procedure finds the PDFs that can approximate the
statistical distribution for the different pixels, and assigns different colors to pixels with
different approximating PDFs. Finally, (3) the above rules are applied to each patch and
decisions are made to answer the following questions:

(1) Is the patch a tumor?

(2) If the patch is a tumor, is it Malignant or Benign?
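
The labeling and size screening in steps (1) and (2) can be illustrated with a generic flood fill. This report does not describe how the Mapping procedure labels patches, so the sketch below is a stand-in under stated assumptions: 4-connectivity, a binary component image given as a list of lists, and a hypothetical function name. The test for a patch lying inside the lung is not modeled.

```python
from collections import deque


def label_patches(mask, min_pixels=100):
    """Label 4-connected patches of truthy cells and keep the large ones.

    mask: non-empty 2-D list of 0/1 values (a binary component image).
    Returns {label: [(row, col), ...]} for patches with at least
    min_pixels pixels. Labels are assigned to every patch in scan order,
    so retained patches keep their original numbers.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    patches, next_label = {}, 1
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_pixels:
                patches[next_label] = pixels
            next_label += 1
    return patches


# Tiny made-up image; the threshold is lowered so the example stays small.
demo = [[1, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 1]]
print(sorted(label_patches(demo, min_pixels=3)))
```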

3.5 - Example 3: Unknown Case Number E7066S2I17

Consider  first  the  case  of  a  lung  with  a  tumor  whose  type  is  unknown.  The  tumor 
is  located  in  the  lower  lung  of  Figure  3.15.  Note  that  the  original  scene  is  a  512  by  512 
image.  As  was  done  in  examples  1  and  2,  the  lung  with  the  tumor  is  selected  as  shown  in 
Figure  3.16.  Note  that  this  scene  has  218  by  284  pixels.  The  tumor  is  shown  by  the  patch 
inside  the  lung  located  in  the  lower  right  of  the  image.  Application  of  the  Mapping 
procedure  results  in  the  composite  image  of  Figure  3.17.  The  component  image  that 
contains  the  tumor  is  selected  as  shown  in  Figure  3.18.  The  patches  in  the  component 
image  are  then  labeled  to  result  in  the  scene  of  Figure  3.19.  Note  that,  as  shown  in  Table 
3.5,  patches  numbered  3,  4,  and  10  are  inside  the  lung  and  have  more  than  100  pixels 
each. Patch numbered 10 is the tumor. For each patch, Table 3.6 summarizes the type of
approximating PDF for each pixel as well as the number of pixels with the same
approximating PDF.

Processing  patch  numbered  3  through  the  Statistical  Procedure  results  in  the 
colored  scene  of  Figure  3.20.  Also,  Figure  3.21  shows  in  black  the  spread  of  the  UV  pairs 
corresponding  to  the  pixels  of  patch  numbered  3.  The  following  are  to  be  noted: 

- In Figure 3.20,

    - most of the pixels have the same color (i.e., PDF); no contiguous or
      concentric regions are observed

- In Figure 3.21,

    - UV pairs are located on the left part of the chart
    - UV pairs are not wide spread

With respect to rules 1 to 3 of Section 3.4, it is noted that (1) the UV pairs are located on
the left part of the chart and not on the right part, and that they are not spread, and (2) the
colored patch does not display any contiguous or concentric behavior of colors. Thus, it is
concluded that the patch is not a tumor.

Processing  patch  numbered  4  through  the  Statistical  Procedure  results  in  the 
colored  scene  of  Figure  3.22.  Also,  Figure  3.23  shows  in  black  the  spread  of  the  UV  pairs 
corresponding  to  the  pixels  of  patch  numbered  4.  The  following  are  to  be  noted: 

- In Figure 3.22,

    - all pixels have the same PDF; no contiguous or concentric regions are
      observed

- In Figure 3.23,

    - UV pairs are located on the middle part of the chart
    - UV pairs are not wide spread

With respect to rules 1 to 3 of Section 3.4, it is noted that (1) the UV pairs are located on
the middle part of the chart and not on the right part, and that they are not spread, and (2)
the colored patch does not display any contiguous or concentric behavior of colors. Thus,
it is concluded that the patch is not a tumor.

Processing  patch  numbered  10  through  the  Statistical  Procedure  results  in  the 
colored  scene  of  Figure  3.24.  Also,  Figure  3.25  shows  in  black  the  spread  of  the  UV  pairs 
corresponding  to  the  pixels  of  patch  numbered  10.  The  following  are  to  be  noted: 

- In Figure 3.24,

    - concentric regions are observed in the colored patch

- In Figure 3.25,

    - UV pairs are located on the right part of the chart
    - UV pairs are wide spread

With respect to rules 1 to 3 of Section 3.4, it is noted that (1) because the UV pairs are
located on the right part of the chart and are wide spread, it is concluded that
the patch is a tumor. Also, (2) because the colored patch displays a concentric behavior
of colors, it is concluded that the patch is a Benign tumor.

Note  that  in  this  example  all  patches  were  correctly  identified  with  respect  to 
containing  a  tumor. 
Figure 3.15 - E7066S2I17 - Original Scene
Figure 3.16 - E7066S2I17 - Lung with Tumor


Figure  3.17  -  E7066S2I17  -  Composite  scene 


Figure  3.18  -  E7066S2I17  -  Component  Scene  Containing  Tumor 


Figure  3.20  -  E7066S2I17  -  Patch  No.  3  with  Approximated  PDFs 
Figure 3.22 - E7066S2I17 - Patch No. 4 with Approximated PDFs
Figure 3.23 - E7066S2I17 - Location of UV pairs for Patch No. 4


Figure  3.24  -  E7066S2I17  -  Patch  No.  10  with  Approximated  PDFs 
Table 3.5 - E7066S2I17 - Results of labeling
Table 3.6 - E7066S2I17 - PDFs of the pixels in patches 3, 4, and 10


3.6 - Example 4: Unknown Case Number E5424S2I15

Consider the case of a lung with a tumor whose type is unknown. The tumor is
located  in  the  lower  lung  of  Figure  3.26.  Note  that  the  original  scene  is  a  512  by  512 
image.  As  was  done  in  the  previous  examples,  the  lung  with  the  tumor  is  first  selected  as 
shown  in  Figure  3.27.  Note  that  the  scene  has  been  reduced  to  197  by  264  pixels.  The 
tumor  is  shown  by  the  patch  inside  the  lung  located  in  the  lower  right  of  the  image. 
Application  of  the  Mapping  procedure  results  in  the  composite  image  of  Figure  3.28.  The 
component  image  that  contains  the  tumor  is  selected  as  shown  in  Figure  3.29.  The 
patches  in  the  component  image  are  then  labeled  to  result  in  the  scene  of  Figure  3.30. 
Note  that,  as  shown  in  Table  3.7,  patches  numbered  5,  7,  and  13  are  inside  the  lung  and 
have more than 100 pixels each. Patch numbered 13 is the tumor. For each patch, Table
3.8 summarizes the type of approximating PDF for each pixel as well as the number of
pixels with the same approximating PDF.

Processing  patch  numbered  5  through  the  Statistical  Procedure  results  in  the 
colored  scene  of  Figure  3.31.  Also,  Figure  3.32  shows  in  black  the  spread  of  the  UV  pairs 
corresponding  to  the  pixels  of  patch  numbered  5.  The  following  are  to  be  noted: 

- In Figure 3.31,

    - concentric regions are observed in the colored patch

- In Figure 3.32,

    - UV pairs are located on the right part of the chart
    - UV pairs are wide spread
    - UV pairs are spread vertically

With respect to rules 1 to 3 of Section 3.4, it is noted that (1) because the UV pairs are
located on the right part of the chart and are wide spread, it is concluded that
the patch is a tumor. Also, (2) because the colored patch displays a concentric behavior
of colors and the UV pairs are spread vertically, it is concluded that the patch is a
Benign tumor.

Processing patch numbered 7 through the Statistical Procedure results in the
colored  scene  of  Figure  3.33.  Also,  Figure  3.34  shows  in  black  the  spread  of  the  UV  pairs 
corresponding  to  the  pixels  of  patch  numbered  7.  The  following  are  to  be  noted: 

- In Figure 3.33,

    - most of the pixels have the same PDF; no contiguous or concentric
      regions are observed

- In Figure 3.34,

    - UV pairs are located on the middle part of the chart
    - UV pairs are not wide spread

With respect to rules 1 to 3 of Section 3.4, it is noted that (1) the UV pairs are located on
the middle part of the chart and not on the right part, and that they are not spread, and (2)
the colored patch does not display any contiguous or concentric behavior of colors. Thus,
it is concluded that the patch is not a tumor.

Processing  patch  numbered  13  through  the  Statistical  Procedure  results  in  the 
colored  scene  of  Figure  3.35.  Also,  Figure  3.36  shows  in  black  the  spread  of  the  UV  pairs 
corresponding  to  the  pixels  of  patch  numbered  13.  The  following  are  to  be  noted: 

- In Figure 3.35,

    - concentric regions are observed in the colored patch
    - contiguous regions are observed in the upper right part of the colored
      patch

- In Figure 3.36,

    - UV pairs are located mostly on the right part of the chart
    - UV pairs are wide spread

With respect to rules 1 to 3 of Section 3.4, it is noted that (1) because the UV pairs are
located on the right part of the chart and are wide spread, it is concluded that
the patch is a tumor. Also, (2) because the colored patch displays a concentric behavior
of colors, it is concluded that the patch is a Benign tumor. In addition, (3) because the
upper right part of the colored patch displays a contiguous behavior of colors, it is also
concluded that the upper part of the patch is a Malignant tumor.

Note that in this example not all patches were correctly identified with respect to
containing a tumor. Namely, patch 5 was identified as containing a tumor; in fact, it did
not.
Figure 3.26 - E5424S2I15 - Original Scene


Figure  3.27  -  E5424S2I15  -  Lung  with  Tumor 
Figure 3.28 - E5424S2I15 - Composite scene
Figure 3.29 - E5424S2I15 - Component Scene Containing Tumor
Figure 3.30 - E5424S2I15 - Labeled Scene
Figure 3.31 - E5424S2I15 - Location of UV pairs for Patch No. 5
Figure 3.34 - E5424S2I15 - Location of UV pairs for Patch No. 7


Figure  3.35  -  E5424S2I15  -  Patch  No.  13  with  Approximated  PDFs 
Table 3.7 - E5424S2I15 - Results of labeling
Table 3.8 - E5424S2I15 - PDFs of the pixels in patches 5, 7, and 13

3.7 - Example 5: Unknown Case Number E18642S2I19

Consider  the  last  case  of  a  lung  with  a  tumor  whose  type  is  unknown.  The  tumor 
is  located  in  the  lower  lung  of  Figure  3.37.  Note  that  the  original  scene  is  a  512  by  512 
image.  As  was  done  in  the  previous  examples,  the  lung  with  the  tumor  is  first  selected  as 
shown  in  Figure  3.38.  Note  that  the  scene  has  been  reduced  to  147  by  174  pixels.  The 
tumor  is  shown  by  the  patch  inside  the  lung  located  in  the  lower  middle  of  the  image. 
Application  of  the  Mapping  procedure  results  in  the  composite  image  of  Figure  3.39.  The 
component  image  that  contains  the  tumor  is  selected  as  shown  in  Figure  3.40.  The 
patches  in  the  component  image  are  then  labeled  to  result  in  the  scene  of  Figure  3.41. 
Note that, as shown in Table 3.9, patch numbered 4 is the only patch inside the lung that
has more than 100 pixels. Patch 4 is the tumor to be investigated. Table 3.10
summarizes the type of approximating PDF for each pixel in patch 4 as well as the
number of pixels with the same approximating PDF.

Processing  patch  numbered  4  through  the  Statistical  Procedure  results  in  the 
colored  scene  of  Figure  3.42.  Also,  Figure  3.43  shows  in  black  the  spread  of  the  UV  pairs 
corresponding to the pixels of patch numbered 4. The following are to be noted:

- In Figure 3.42,

    - most of the pixels have the same PDF; no contiguous or concentric
      regions are observed

- In Figure 3.43,

    - UV pairs are located on the middle part of the chart
    - UV pairs are wide spread
    - UV pairs are spread vertically

With respect to rules 1 to 3 of Section 3.4, it is noted that

(1) the UV pairs are located on the middle part of the chart and not on the
right part. On the other hand, the UV pairs are wide spread. Here, only
part of rule 1 is verified. Thus, no conclusion can be made
about whether the patch is a tumor. Also,

(2) the colored patch does not display any contiguous or concentric
behavior of colors. On the other hand, the UV pairs are vertically
spread. Thus, only part of rule 2 for a benign tumor is verified. As a
result, no conclusion can be made about whether the patch is a
benign tumor.

Note that in this example, because only parts of rules 1 and 2 apply, it was not
possible to reach a definite conclusion about whether the patch is a tumor and whether it is
Benign.
Figure 3.37 - E18642S2I19 - Original Scene
Figure 3.38 - E18642S2I19 - Lung with Tumor
Figure 3.39 - E18642S2I19 - Composite scene
Figure 3.40 - E18642S2I19 - Component Scene Containing Tumor
Figure 3.42 - E18642S2I19 - Patch No. 4 with Approximated PDFs

Chapter IV
Conclusion


4.1  -  Introduction 

Recall that the goals of the research are to:

- detect Tumors located in lungs,

- classify Tumors as Malignant or Benign,

- detect Tumors at very early stages.


In this part of the research, the first two goals were particularly targeted. The
approach consisted of using (1) the Mapping Procedure of A’SCAPE to partition
and clean the scene, and (2) the Statistical Procedure of A’SCAPE to classify a patch as a
tumor or not (detection), and, if it is a tumor, to determine if it is Malignant or Benign
(classification).

4.2  -  Summary  of  the  results 

Using  only  one  sample  of  a  known  Benign  case  and  another  sample  of  a  known 
Malignant case, rules were built to detect and classify tumors. The rules were applied to a
limited  test  case  library  of  3  samples.  The  results  were  presented  at  the  Cornell  Medical 
Center  and  the  following  answers  were  provided  to  the  questions  raised  by  this  research: 

(1)  The  Mapping  procedure  is  doing  a  good  job  by  cleaning  and  “discarding” 
unneeded  information  such  as  branches. 

(2) The Mapping procedure needs to keep detailed information about the edges of
the patches left in the scene. Note that one of the discriminants that
radiologists use to classify tumors as Benign or Malignant is how smooth or
sharp the edges are.

(3) The composite image of a given scene provides more information about the
scene than the raw image.

(4) Lung details can be obtained with higher resolution by zooming in on regions of
interest during CT scanning. This, in particular, addresses the limitation on
patch size, since this study needs a minimum of 100 samples per patch for the
Statistical procedure to function.

(5)  A  larger  library  will  be  provided  for  rules-building  and  rules-testing.  Namely, 
a  minimum  of  20  cases  will  be  provided  for  rules-building  and  another 
minimum  of  20  cases  for  rules-testing. 

(6) Instead of trying to classify tumors as Benign or Malignant, it will be more
profitable for the radiologists to provide them with "descriptors" that would
strengthen their decisions. Such descriptors can be the rules built in Section
3.4. This is because classifying a tumor is a difficult and sometimes an
unsolvable problem. As an example, radiologists can look at the tumor at the
cell level without being able to decide whether it is Benign or Malignant.


Study  of  this  problem  will  continue  once  additional  data  is  received.  In  addition, 
the  goals  will  be  tailored  according  to  the  notes  above,  and  cases  of  early  stages  of 
tumors  will  be  studied. 